A problem with the correlation coefficient as a measure of gene expression divergence.

Authors

  • Vini Pereira
  • David Waxman
  • Adam Eyre-Walker
Abstract

The correlation coefficient is commonly used as a measure of the divergence of gene expression profiles between different species. Here we point out a potential problem with this statistic: if measurement error is large relative to the differences in expression, the correlation coefficient will tend to show high divergence for genes that have relatively uniform levels of expression across tissues or time points. We show that genes with a conserved uniform pattern of expression have significantly higher levels of expression divergence, when measured using the correlation coefficient, than other genes, in a data set from mouse, rat, and human. We also show that the Euclidean distance yields low estimates of expression divergence for genes with a conserved uniform pattern of expression.
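The effect described in the abstract can be illustrated with a small simulation. This is a minimal sketch, not the authors' analysis: the profiles, the noise level (`noise_sd = 0.5`), the number of tissues (10), and all function names are hypothetical choices made for illustration. Each gene has the same true expression profile in two species (no real divergence), and independent Gaussian measurement error is added to each species. Divergence is then measured as one minus the Pearson correlation and as the Euclidean distance.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def euclidean(x, y):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def mean_divergence(profile, noise_sd, n_reps, rng):
    """Average (1 - r) and Euclidean distance over repeated noisy measurements.

    Both species share the same true profile, so any divergence
    reported here is produced by measurement error alone.
    """
    total_corr = 0.0
    total_euc = 0.0
    for _ in range(n_reps):
        species_a = [v + rng.gauss(0, noise_sd) for v in profile]
        species_b = [v + rng.gauss(0, noise_sd) for v in profile]
        total_corr += 1 - pearson(species_a, species_b)
        total_euc += euclidean(species_a, species_b)
    return total_corr / n_reps, total_euc / n_reps


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical profiles across 10 tissues (arbitrary units).
uniform_profile = [5.0] * 10
variable_profile = [1.0, 9.0, 2.0, 8.0, 3.0, 7.0, 4.0, 6.0, 0.5, 9.5]

uniform_corr, uniform_euc = mean_divergence(uniform_profile, 0.5, 2000, rng)
variable_corr, variable_euc = mean_divergence(variable_profile, 0.5, 2000, rng)

print("uniform gene : 1 - r =", round(uniform_corr, 3), " Euclidean =", round(uniform_euc, 3))
print("variable gene: 1 - r =", round(variable_corr, 3), " Euclidean =", round(variable_euc, 3))
```

Under these assumptions, the uniformly expressed gene shows a correlation-based divergence close to 1, because the only variation across tissues is noise, whereas the variably expressed gene shows a value close to 0. The Euclidean distance is about the same for both genes, since it depends only on the measurement error and is not inflated for the uniform profile.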

Related articles

Prediction of Blasting Cost in Limestone Mines Using Gene Expression Programming Model and Artificial Neural Networks

The use of blasting cost (BC) prediction to achieve optimal fragmentation is necessary in order to control the adverse consequences of blasting such as fly rock, ground vibration, and air blast in open-pit mines. In this research work, BC is predicted through collecting 146 blasting data from six limestone mines in Iran using the artificial neural networks (ANNs), gene expression programming (G...

Clustering Gene Expression Data Using Random Forest Dissimilarity

Background: The clustering of gene expression data plays an important role in the diagnosis and treatment of cancer. These kinds of data typically involve a large number of variables (genes) in comparison with the number of samples (patients). Many clustering methods have been built based on the dissimilarity among observations, which is calculated by a distance function. As increa...

Modification of the Fast Global K-means Using a Fuzzy Relation with Application in Microarray Data Analysis

Recognizing genes with distinctive expression levels can help in prevention, diagnosis and treatment of the diseases at the genomic level. In this paper, fast Global k-means (fast GKM) is developed for clustering the gene expression datasets. Fast GKM is a significant improvement of the k-means clustering method. It is an incremental clustering method which starts with one cluster. Iteratively ...

"Divergence problem" in estimating temperature based on tree rings (Case study: Juniper mountainous habitats in northern Kerman province)

Introduction: The study of tree rings is one of the most widely used methods of climate reconstruction over centuries and millennia, but the occurrence of climatic anomalies such as global warming in recent decades has caused a divergence problem in tree-ring series in some areas, which challenges the ability of this proxy to reconstruct the climate. The “divergence problem” is the differen...

Effect of salicylic acid on stevioside and rebaudioside A production and transcription of biosynthetic genes in in vitro culture of Stevia rebaudiana

S. rebaudiana produces steviol glycosides, including stevioside and rebaudioside A, that are valuable as low-calorie sweeteners. The objective of this study was to investigate the effects of salicylic acid elicitation and sampling times on the improvement of stevioside and rebaudioside A production and on the expression of the KA13H, UGT74G1 and UGT76G1 genes. The results showed that the addition of different c...

Journal:
  • Genetics

Volume 183, Issue 4

Pages: -

Publication date: 2009